6 research outputs found

    Algorithms and lower bounds for testing properties of structured distributions

    In this doctoral thesis we consider various property testing problems for structured distributions. A distribution is said to be structured if it belongs to a class that admits a simple description in approximation terms. Such distributions often arise in practice; for example, log-concave distributions, which are easily approximated by polynomials (see [Bir87a]), often appear in econometric research. For structured distributions, testing a property often requires far fewer samples than for general unrestricted distributions. In this thesis we prove that this is indeed the case for several distance-related properties. Namely, we give explicit sublinear-time algorithms for $L_1$ and $L_2$ distance testing between two structured distributions for the cases when either one or both of them are available as a “black box”. We also prove that these algorithms have the best possible asymptotic complexity by proving matching lower bounds in the form of explicit problem instances (albeit constructed using randomized techniques) that require at least a specified amount of data to be tested successfully. As the main numerical result, we prove that testing whether the total variation distance to an explicitly given distribution is at least $\epsilon$ requires $O(\sqrt{k}/\epsilon^2)$ samples, where $k$ is an approximation parameter that depends on the class of distributions being tested and is independent of the support size. Testing whether the total variation distance between two “black box” distributions is at least $\epsilon$ requires $O(k^{4/5}/\epsilon^{6/5})$ samples. In some cases, when $k \sim n$, this result may be worse than using an unrestricted testing algorithm (which requires $O(n^{2/3}/\epsilon^2)$ samples, where $n$ is the domain size). To address this issue, we develop a third algorithm, which requires $O(k^{2/3}\epsilon^{-4/3} \log^{4/3}(n/k) \log\log(n/k))$ samples and serves as a bridge between the cases of small and large domain sizes.
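To make the comparison between these bounds concrete, the sketch below evaluates the three sample-complexity expressions quoted in the abstract. It is illustrative only: all hidden constants are set to 1, the exponents on ε are read as negative, and the function names are hypothetical rather than taken from the thesis. The outputs show relative scaling, not actual sample sizes.

```python
import math

def structured_closeness(k, eps):
    # k^(4/5) / eps^(6/5): bound quoted for two "black box" structured distributions
    return k ** (4 / 5) / eps ** (6 / 5)

def unrestricted_closeness(n, eps):
    # n^(2/3) / eps^2: bound the abstract quotes for an unrestricted tester
    return n ** (2 / 3) / eps ** 2

def bridge_closeness(k, n, eps):
    # k^(2/3) / eps^(4/3) * log^(4/3)(n/k) * log log(n/k)
    # Assumes n/k > e so that log log(n/k) is positive.
    ratio = n / k
    return (k ** (2 / 3) / eps ** (4 / 3)
            * math.log(ratio) ** (4 / 3)
            * math.log(math.log(ratio)))

if __name__ == "__main__":
    n, eps = 10**6, 0.5
    for k in (10, 1000, n):
        print(k, structured_closeness(k, eps), unrestricted_closeness(n, eps))
```

With these assumptions, the structured bound is far below the unrestricted one for small k and exceeds it once k is comparable to n, which is the regime the third algorithm is meant to cover.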

    Testing Identity of Structured Distributions

    We study the question of identity testing for structured distributions. More precisely, given samples from a structured distribution $q$ over $[n]$ and an explicit distribution $p$ over $[n]$, we wish to distinguish whether $q = p$ versus $q$ is at least $\epsilon$-far from $p$, in $L_1$ distance. In this work, we present a unified approach that yields new, simple testers, with sample complexity that is information-theoretically optimal, for broad classes of structured distributions, including $t$-flat distributions, $t$-modal distributions, log-concave distributions, monotone hazard rate (MHR) distributions, and mixtures thereof. Comment: 21 pages, to appear in SODA'1

    Optimal Algorithms and Lower Bounds for Testing Closeness of Structured Distributions

    We give a general unified method that can be used for $L_1$ closeness testing of a wide range of univariate structured distribution families. More specifically, we design a sample optimal and computationally efficient algorithm for testing the equivalence of two unknown (potentially arbitrary) univariate distributions under the $\mathcal{A}_k$-distance metric: Given sample access to distributions with density functions $p, q: I \to \mathbb{R}$, we want to distinguish between the cases that $p = q$ and $\|p-q\|_{\mathcal{A}_k} \ge \epsilon$ with probability at least $2/3$. We show that for any $k \ge 2$, $\epsilon > 0$, the optimal sample complexity of the $\mathcal{A}_k$-closeness testing problem is $\Theta(\max\{ k^{4/5}/\epsilon^{6/5}, k^{1/2}/\epsilon^2 \})$. This is the first $o(k)$ sample algorithm for this problem, and yields new, simple $L_1$ closeness testers, in most cases with optimal sample complexity, for broad classes of structured distributions. Comment: 27 pages, to appear in FOCS'1
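The abstract does not define the $\mathcal{A}_k$-distance. The sketch below assumes the definition commonly used in this line of work: the maximum, over partitions of the domain into at most $k$ consecutive intervals $I_1, \dots, I_k$, of $\sum_j |p(I_j) - q(I_j)|$. It computes that quantity exactly for two explicit discrete distributions with a simple dynamic program. The function name `ak_distance` is hypothetical, and this is not the paper's tester, which works from samples.

```python
def ak_distance(p, q, k):
    """A_k distance between explicit discrete distributions p and q.

    Assumed definition: max over partitions of the domain into at most k
    consecutive intervals of sum_j |p(I_j) - q(I_j)|.
    Runs in O(k * n^2) time, fine for small illustrative inputs.
    """
    n = len(p)
    assert len(q) == n and k >= 1
    k = min(k, n)  # cannot use more intervals than domain points
    # prefix[i] = sum of (p - q) over the first i points
    prefix = [0.0]
    for a, b in zip(p, q):
        prefix.append(prefix[-1] + (a - b))
    neg_inf = float("-inf")
    # best[j][i]: best value when the first i points form exactly j intervals
    best = [[neg_inf] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for j in range(1, k + 1):
        for i in range(j, n + 1):
            best[j][i] = max(
                best[j - 1][t] + abs(prefix[i] - prefix[t])
                for t in range(j - 1, i)
            )
    return max(best[j][n] for j in range(1, k + 1))

if __name__ == "__main__":
    p = [0.5, 0.0, 0.5, 0.0]
    q = [0.0, 0.5, 0.0, 0.5]
    for k in (1, 2, 4):
        print(k, ak_distance(p, q, k))
```

Under this definition the distance is nondecreasing in $k$ and coincides with the $L_1$ distance once $k$ reaches the domain size, which is why an $\mathcal{A}_k$ tester can yield $L_1$ testers for classes that are well approximated by piecewise-constant functions.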

    Near-Optimal Closeness Testing of Discrete Histogram Distributions

    We investigate the problem of testing the equivalence between two discrete histograms. A $k$-histogram over $[n]$ is a probability distribution that is piecewise constant over some set of $k$ intervals over $[n]$. Histograms have been extensively studied in computer science and statistics. Given a set of samples from two $k$-histogram distributions $p, q$ over $[n]$, we want to distinguish (with high probability) between the cases that $p = q$ and $\|p-q\|_1 \geq \epsilon$. The main contribution of this paper is a new algorithm for this testing problem and a nearly matching information-theoretic lower bound. Specifically, the sample complexity of our algorithm matches our lower bound up to a logarithmic factor, improving on previous work by polynomial factors in the relevant parameters. Our algorithmic approach applies in a more general setting and yields improved sample upper bounds for testing closeness of other structured distributions as well.
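As a small illustration of the objects in this abstract, the sketch below builds explicit $k$-histograms over $[n]$ from interval endpoints and interval masses, and computes the $L_1$ distance between two of them. The helper names and the endpoint/mass parameterization are assumptions chosen for the example. The paper's tester only sees samples, not these explicit vectors.

```python
def make_histogram(n, breakpoints, masses):
    """Expand a k-histogram over [n] into a length-n probability vector.

    breakpoints: increasing exclusive right endpoints of the k intervals,
                 with the last one equal to n.
    masses:      total probability of each interval, spread uniformly inside it.
    """
    assert len(breakpoints) == len(masses) and breakpoints[-1] == n
    assert abs(sum(masses) - 1.0) < 1e-9
    probs, left = [], 0
    for right, mass in zip(breakpoints, masses):
        width = right - left
        assert width > 0, "intervals must be non-empty and increasing"
        probs.extend([mass / width] * width)
        left = right
    return probs

def l1_distance(p, q):
    # ||p - q||_1 for two explicit distributions on the same domain
    return sum(abs(a - b) for a, b in zip(p, q))

def count_pieces(p, tol=1e-12):
    # Number of maximal constant runs, i.e. the smallest k for which p is a k-histogram
    return 1 + sum(1 for a, b in zip(p, p[1:]) if abs(a - b) > tol)

if __name__ == "__main__":
    p = make_histogram(8, [4, 8], [0.75, 0.25])
    q = make_histogram(8, [2, 8], [0.5, 0.5])
    print(count_pieces(p), count_pieces(q), l1_distance(p, q))
```

The testing problem asks whether such a pair has distance 0 or at least $\epsilon$; here both distributions are 2-histograms with different breakpoints, so their difference is piecewise constant on at most three intervals.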

    Investigation of laser modification and light-induced electric signals in Y-Ba-Cu-O films

    The aim of the paper is the experimental investigation of variations in the electrical and optical properties of high-temperature superconductor films. As a result, a model of the anisotropic thermo-EMF induced by laser radiation has been developed. A simple methodology for determining the dielectric permittivity and thickness of films has been suggested. The laser modification of high-temperature superconducting films has been investigated, and the applicability of the thermal model to it has been established. Non-stationary electric signals in Y-Ba-Cu-O films have been discovered and investigated. The results may find application in optical detectors and moisture-free contact laser lithography. Available from VNTIC - Scientific & Technical Information Centre of Russia. SIGLE, RU, Russian Federation.